Support multi-tenant RAM buffers for IndexWriter #13951
base: main
Conversation
Thanks for looking into it. It feels like there should be closer integration between this and the existing FlushPolicy/FlushByRamOrCountsPolicy?

I think we'll also want to look into the performance overhead of this. It's likely not good to iterate over tens of IndexWriters on every indexed document to check if the total buffer usage is above the limit?
   * @throws IOException if the directory cannot be read/written to, or if it does not exist and
   *     <code>conf.getOpenMode()</code> is <code>OpenMode.APPEND</code> or if there is any other
   *     low-level IO error
   */
  public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException {
  public IndexWriter(
      Directory d, IndexWriterConfig conf, IndexWriterRAMManager indexWriterRAMManager)
It should be on the IndexWriterConfig rather than another IndexWriter ctor argument.
Sure, that makes sense to me.
  /**
   * For managing multiple instances of {@link IndexWriter} sharing the same buffer (configured by
   * {@link IndexWriterConfig#setRAMBufferSizeMB})
It should be the other way around in my opinion, the RAM buffer size should be on IndexWriterRAMManager, and setting a ramBufferSizeMB on IndexWriterConfig would internally create a new IndexWriterRAMManager under the hood that is shared with no other IndexWriter.
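To make the suggestion concrete, here is a minimal sketch of that ownership model; the class and method names (MyIndexWriterConfig, setIndexWriterRAMManager) are stand-ins for illustration and not the PR's actual API:

```java
// Sketch only: hypothetical shapes, not the actual Lucene classes.
class IndexWriterRAMManager {
  private final double ramBufferSizeMB; // the RAM limit lives on the manager

  IndexWriterRAMManager(double ramBufferSizeMB) {
    this.ramBufferSizeMB = ramBufferSizeMB;
  }

  double getRamBufferSizeMB() {
    return ramBufferSizeMB;
  }
}

class MyIndexWriterConfig {
  private IndexWriterRAMManager ramManager;

  // Setting ramBufferSizeMB internally creates a manager that is shared
  // with no other IndexWriter.
  MyIndexWriterConfig setRAMBufferSizeMB(double mb) {
    this.ramManager = new IndexWriterRAMManager(mb);
    return this;
  }

  // Explicitly passing a manager lets several writers share one budget.
  MyIndexWriterConfig setIndexWriterRAMManager(IndexWriterRAMManager manager) {
    this.ramManager = manager;
    return this;
  }
}
```

With that shape, configs built from the same manager share one RAM budget, while a plain setRAMBufferSizeMB call keeps today's single-writer behavior.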
Sorry, I'm probably missing something here - so I get what you're saying about having the IndexWriterRAMManager in the IndexWriterConfig, but what would be the point of creating an IndexWriterRAMManager for a single IndexWriter? Wouldn't DocumentWriterFlushControl be sufficient for this case?
> but what would be the point of creating an IndexWriterRAMManager for a single IndexWriter

I think the idea is to be able to create IndexWriters that don't share their RAM buffer limit with other writers. Maybe we could just set IndexWriterRAMManager to null if ramBufferSizeMB is explicitly specified in the IW config (instead of creating a new ram manager).
Hmm, I took a slightly different approach, will publish a new PR soon. Maybe we can discuss it there, but pretty much I just do what @jpountz suggested and move the ramBufferSizeMB to be a value held by the IndexWriterRAMManager. I think we can then discuss if we should just disable calling writer flushes in the manager if there is only a single writer.
  /**
   * Chooses which writer should be flushed. Default implementation chooses the writer with most RAM
   * usage
FWIW we ran benchmarks in Elasticsearch, and the approach that worked the best was to flush IndexWriters in a round-robin fashion. This is not intuitive at first sight, but I believe that it works better in practice because it is more likely to flush DWPTs that are little used, and also because otherwise you favor IndexWriters that do little indexing over IndexWriters that do heavy indexing (and are thus more likely to have a large buffer).
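For illustration only, a round-robin choice compatible with a chooseWriterToFlush-style hook could look roughly like this; it is a sketch under the assumption that writers are kept in a stable order, and Writer is a placeholder for IndexWriter:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: pick writers in a fixed rotation instead of picking the one with the
// largest RAM buffer, so lightly-used writers still get flushed periodically
// and heavy indexers are not penalized for having big buffers.
class RoundRobinChooser<Writer> {
  private final AtomicInteger next = new AtomicInteger();

  Writer chooseWriterToFlush(List<Writer> writers) {
    if (writers.isEmpty()) {
      return null;
    }
    // floorMod keeps the index valid even after the counter overflows.
    int index = Math.floorMod(next.getAndIncrement(), writers.size());
    return writers.get(index);
  }
}
```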
Hmm, that's interesting! I can make that the default implementation then.

So I actually had the same thought, but when I took a look at it I couldn't think of a clean way to integrate the two... but I'll give it some more thought.
Ah I thought that the

For what it's worth, these classes are package-private, so we can feel free to change their API.
   * @return the IndexWriter to flush
   */
  protected static IndexWriter chooseWriterToFlush(
      Collection<IndexWriter> writers, IndexWriter callingWriter)
Do we need the calling writer here?
I don't use it, but I left it there in case someone wants to override the FlushPolicy in such a way that they might need the calling writer, i.e. just flush the calling writer every time.
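As a hedged illustration of that kind of override (the class shape is hypothetical; the PR's actual method is static, so the real extension point may look different):

```java
import java.util.Collection;

// Hypothetical override sketch: ignore RAM usage entirely and always flush the
// writer that triggered the check. Writer stands in for IndexWriter.
class FlushCallingWriterChooser<Writer> {
  protected Writer chooseWriterToFlush(Collection<Writer> writers, Writer callingWriter) {
    return callingWriter;
  }
}
```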
"Writer " + id + " has not been registered or has been removed already"); | ||
} | ||
long totalRam = 0L; | ||
for (IndexWriter writer : idToWriter.values()) { |
Would it make sense to cache ramBytesUsed() and only update it for the calling writer?
Yeah, this makes sense, I didn't realize that ramBytesUsed() is not necessarily a cheap operation. Changed it in the next rev.
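A sketch of what that caching could look like (a hypothetical helper, not the PR's code; the delta bookkeeping here is simplified and not fully race-free):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch: remember each writer's last reported RAM usage so that indexing a doc
// in one writer only refreshes that writer's entry instead of calling
// ramBytesUsed() on every registered writer.
class CachedRamAccounting<Writer> {
  private final Map<Writer, Long> lastReported = new ConcurrentHashMap<>();
  private final LongAdder totalBytes = new LongAdder();

  // Called by the writer that just indexed or deleted a document.
  void update(Writer writer, long currentRamBytesUsed) {
    Long previous = lastReported.put(writer, currentRamBytesUsed);
    totalBytes.add(currentRamBytesUsed - (previous == null ? 0L : previous));
  }

  void remove(Writer writer) {
    Long previous = lastReported.remove(writer);
    if (previous != null) {
      totalBytes.add(-previous);
    }
  }

  long totalRamBytesUsed() {
    return totalBytes.sum();
  }
}
```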
Thanks for all the comments guys, I've been pretty busy with some life things/work, but hopefully I'll put out another update by tomorrow!
Sorry it took so long to get to this PR. I have some code refactoring comments that should help simplify the changes.
    }
  }

  private static class LinkedIdToWriter {
How about using a Java Queue implementation instead of the custom linked-list logic? You could round-robin on elements by removing, processing, and adding them back to the queue. I suppose this queue size would be small, so array deque and linked lists are both fine? We can also get some thread-safe implementations out of the box.
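A small sketch of the queue-based rotation being suggested (illustrative only; the flush trigger and total-RAM bookkeeping are out of scope):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch: keep writers in a queue and round-robin by polling the head and
// re-adding it at the tail once it has been asked to flush.
class WriterRotation<W> {
  private final Queue<W> queue = new ArrayDeque<>();

  synchronized void register(W writer) {
    queue.add(writer);
  }

  synchronized void unregister(W writer) {
    queue.remove(writer);
  }

  // Head of the queue = the writer that was flushed longest ago.
  synchronized W pollNextToFlush() {
    W next = queue.poll();
    if (next != null) {
      queue.add(next); // rotate to the back
    }
    return next;
  }
}
```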
Hmm, so I looked into using both Queue and LinkedHashMap to do this. The issue I found was that there was still complexity in maintaining the last id that was flushed and the total RAM used, so the only function that really became less complex was the addWriter function. Given that this probably won't simplify the overall function too much, I'm inclined to keep the implementation the same way it is right now, but if you disagree feel free to let me know and I can take a second look at it.
  /**
   * For use in {@link IndexWriter}, manages communication with the {@link IndexWriterRAMManager}
   */
  public static class PerWriterIndexWriterRAMManager {
Do we really need this class? IndexWriter (and FlushPolicy too) should already have a reference to the ram manager via IndexWriterConfig.

How about we just store the writer's "ramManagerID" inside IndexWriter, and use it to invoke ram manager APIs directly? In fact, can we work with the writer object directly instead of keeping these "id" mappings? Similar to how FlushPolicy accepts the calling DWPT in its APIs...
Yeah, I think that makes sense, I'll remove it.
@@ -142,7 +142,21 @@ public IndexWriterConfig() {
   * problem you should switch to {@link LogByteSizeMergePolicy} or {@link LogDocMergePolicy}.
   */
  public IndexWriterConfig(Analyzer analyzer) {
    super(analyzer);
    this(analyzer, new IndexWriterRAMManager(IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB));
Any reason for not making this change in the default constructor? We could avoid making changes to all the tests.
I think the reason I had to make all the changes in the tests was because I added this constructor:

public IndexWriterConfig(IndexWriterRAMManager indexWriterRAMManager) {

And then in the tests there were a bunch of new IndexWriterConfig(null) calls which became ambiguous. I think that this constructor is potentially useful, which is why I took the hit and changed all those tests, but I can remove it to avoid all those test changes?
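A minimal, self-contained reproduction of that ambiguity with stand-in classes (not the real Lucene types) shows why new IndexWriterConfig(null) stops compiling once both single-argument constructors exist:

```java
// Stand-in types to reproduce the overload ambiguity described above.
class Analyzer {}

class IndexWriterRAMManager {}

class Config {
  Config(Analyzer analyzer) {}

  Config(IndexWriterRAMManager ramManager) {}
}

class Demo {
  void example() {
    // Config c = new Config(null);          // does not compile: null matches both
    //                                       // reference-typed overloads equally well
    Config c = new Config((Analyzer) null);  // tests must disambiguate with a cast
  }
}
```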
Just ran some benchmarks from Edit: OK it seems that

Ugh ... "who goes first" bias? Maybe the line docs file was cold on your first run? Though it's surprising that'd make such a big difference overall (21 -> 16) since that usage (reading a single file sequentially) is normally well optimized by OS readahead ... Could you please open a

Sure @mikemccand, added an issue here. I also was able to test my change by just doing two separate runs and manually changing the test to point at whichever version I needed. My test setup involved hacking I realize that this is not a very realistic scenario, so I will try benchmarking with more writers and also skew the indexing rates of each writer to see how the change performs. Edit:
@mdmarshmallow thank you for benchmarking this change! I know that is tricky and I think it's fine that we can't measure a performance gain? Achieving the same performance as N (2 or 5 in your tests) fully separate

The only case I'd expect to see a performance win is if the rate of indexing is very unbalanced across the five shards ... e.g. you have five
Description
Draft PR to outline my initial approach. I introduced IndexWriterRamManager to control writer flushes. I also have a function IndexWriterRamManager#chooseWriterToFlush that lets the user choose which writer they specifically want to be flushed, sort of like a simple flush policy. Currently it defaults to just choosing the writer with the most RAM usage.

One thing I wanted to avoid was starting another thread to just poll the IndexWriter memory usage, so I just added a listener in IndexWriter#maybeProcessEvents which is called whenever docs are updated or deleted (IndexWriterRamManager#flushIfNecessary). I wanted this method to be called whenever FlushPolicy#onChange was called, as I believe they both kinda do the same thing.

No unit tests yet, but will add them! I just wanted to have my approach sanity checked for now.
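A condensed sketch of that flow, assuming ramBytesUsed() and flushNextBuffer() as the writer-side hooks; this illustrates the described approach and is not the PR's actual code:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: no polling thread. The writer that just indexed or deleted documents
// asks the shared manager to re-check total RAM usage, and if the combined usage
// is over the limit, the manager picks one writer and flushes it.
class RamManagerSketch {
  interface Writer {
    long ramBytesUsed();

    void flushNextBuffer();
  }

  private final Set<Writer> writers = ConcurrentHashMap.newKeySet();
  private final long limitBytes;

  RamManagerSketch(long limitBytes) {
    this.limitBytes = limitBytes;
  }

  void register(Writer writer) {
    writers.add(writer);
  }

  // Hooked into the indexing path (the PR calls this from IndexWriter#maybeProcessEvents).
  void flushIfNecessary(Writer callingWriter) {
    long total = 0;
    for (Writer w : writers) {
      total += w.ramBytesUsed();
    }
    if (total > limitBytes) {
      chooseWriterToFlush(callingWriter).flushNextBuffer();
    }
  }

  // Default in this revision: flush the writer currently using the most RAM.
  private Writer chooseWriterToFlush(Writer fallback) {
    Writer best = fallback;
    long bestBytes = -1;
    for (Writer w : writers) {
      long bytes = w.ramBytesUsed();
      if (bytes > bestBytes) {
        bestBytes = bytes;
        best = w;
      }
    }
    return best;
  }
}
```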
Revision 2
Added an IndexWriterRAMManager that is tied to the IndexWriterConfig. The user can either pass in their own if they want multiple IndexWriter instances to share a single buffer, or just let the IndexWriterRAMManager be created per IndexWriter.
Revision 3
Revision 4